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We propose a way of obtaining effective low energy Hubbard-like model Hamiltonians from ab initio quantum Monte 
Carlo (QMC) calculations for molecular and extended systems. The Hamiltonian parameters are fit to best match the 
ab initio two-body density matrices and energies of the ground and excited states, and thus we refer to the method as ab 
initio density matrix based downfolding (AIDMD). For benzene (a finite system), we find good agreement with experi¬ 
mentally available energy gaps without using any experimental inputs. For graphene, a two dimensional solid (extended 
system) with periodic boundary conditions, we find the effective on-site Hubbard U*/t to be 1.3 ± 0.2, comparable to 
a recent estimate based on the constrained random phase approximation. For molecules, such parameterizations enable 
calculation of excited states that are usually not accessible within ground state approaches. For solids, the effective 
Hamiltonian enables large-scale calculations using techniques designed for lattice models. 


I. INTRODUCTION 


The reliable simulation of systems for which the large-scale 
physics is not well-approximated by a non-interacting model, 
is a major challenge in physics, chemistry, and materials sci¬ 
ence. These systems appear to require a multi-scale approach 
in which the effective interactions between electrons at a small 
distance scale are determined, which then leads to a coarse¬ 
grained description of emergent correlated physics. This re¬ 
duction of the Hilbert space is often known as ’’downfolding”. 
In strongly-correlated systems, the correct effective Hamil¬ 
tonian is strongly dependent on material-specific properties, 
motivating the need for a generic accurate method to deter¬ 
mine it. 

One can loosely categorize downfolding techniques into 
two strategies. The first strategy is based on performing ab 
initio calculations and then matching them state by state to 
the effective model. Alternately, some approaches employ a 
model for the screening of Coulomb interactions, for which 
the ab initio single particle wavefunctions provide the relevant 
inputs. For the purposes of this manuscript, the umbrella term 
we use for these strategies is ’’fitting”. Techniques that fall 
into this class include the constrained density functional the¬ 
ory*’^, the constrained random phase approximation (cRPA)^, 
fitting spin models to energies"^’^, and efforts by one of us® us¬ 
ing reduced density matrices of quantum Monte Carlo (QMC) 
calculations. The second class is based on Lowdin down- 
folding^^® and canonical transformation theorywhich 
involves a sequence of unitary transformations on the bare 
Hamiltonian, chosen in a way that minimize the size of the 
matrix elements connecting the relevant low energy (valence) 
space to the high energy one. 

Downfolding by fitting has the advantage that it is con¬ 
ceptually straightforward to perform, although it demands an 
a priori parameterization of the effective Hamiltonian. The 
methods have been applied to complex bulk systems 
but it is only recently that their accuracy is being rigorously 
checked^'. On the other hand, canonical transformations do 
not need such parameterizations and can discover the rele¬ 
vant terms in an automated way. However, their application 
to complex materials remains to be carried out and tested. 


Here we propose a scheme which aims to capture the ’’best 
of both worlds”. On the one hand, we retain the simplicity 
of fitting and on the other we use information from accurate 
many-body wavefunctions to determine which terms are im¬ 
portant. The deviations between the ab initio and model prop¬ 
erties allows us to assess the quality of the resultant model 
and to discover relevant physics from the calculation. Simul¬ 
taneously, the method cannot depend too much on the quality 
of the ab initio solution because of the inherent limitations of 
accuracy, especially for very big system sizes. 

Once an effective model Hamiltonian in the reduced Hilbert 
space is obtained, as is depicted in Fig. 1, it can be used to 
perform a calculation on a larger system using techniques de¬ 
signed for lattice models^^^^®. This multi-step modeling pro¬ 
cedure is needed since the ab initio calculations for a given 
system size are, in general, computationally more expensive 
than the equivalent lattice calculations. Large sizes are crucial 
to study finite size effects, and in turn theoretically establish 
the presence of a phase. In addition, excited states and dynam¬ 
ical correlation functions have traditionally been difficult in 
ab initio approaches, but have seen progress for lattice model 
methods^®^^^. 
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Figure 1. Schematic for downfolding. The full Hamiltonian H is 
defined in the space of active (partially occupied), core (mostly oc¬ 
cupied) and virtual (mostly unoccupied) orbitals. These orbitals have 
been arranged according to their energy (E) in the figure. The objec¬ 
tive is to map the physics of the original system to that of the effective 
one, defined only in the active space, with Hamiltonian H. 
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In this paper, we demonstrate a downfolding method that 
uses data from ab initio QMC techniques to derive an effective 
coarse-grained Hamiltonian. This method, which we call non¬ 
eigenstate ab initio density matrix downfolding (N-AIDMD), 
uses many-body simulations of non-eigenstates to fit an effec¬ 
tive low-energy Hamiltonian. We demonstrate that the method 
can use wavefunctions of medium quality to derive highly ac¬ 
curate effective Hamiltonians. After demonstrating a simple 
example, we downfold benzene from a 30-electron problem 
to a 6-electron one and show that the resulting Hamiltonian 
reproduces the experimental spectrum well. We also show 
that the method works for extended systems, by applying it to 
graphene. 


II. METHODS 

In the present section we discuss the methods we used to 
generate our ab initio data. While most of our discussion is 
specific to QMC, the quantities used can also be calculated 
in almost any other wavefunction-based quantum chemistry 
method. 


A. Variational and Fixed Node Diffusion Quantum 
Monte Carlo 


determinant part of the trial wavefunction. For molecules, a 
multi-determinantal wavefunction is generated by performing 
a configuration interaction with singles and doubles excita¬ 
tions (CISD) calculation from the reference Slater determi¬ 
nant. Once this is done, a Jastrow factor is introduced, 
resulting in the ansatz for the trial wavefunction ipT, 

ipTiri,r2,....rN) = j'^diDi ( 1 ) 


where (ri, 7 - 2 , ....r-Ar) refers to the coordinates of the electrons 
(the spin indices, being fixed, have been suppressed), Di are 
determinants and di their corresponding coefficients. When 
we desire eigenstates, the parameters in the Jastrow J and 
the coefficients di are optimized to get the best possible varia¬ 
tional energy within the ansatz chosen using a technique intro¬ 
duced by Umrigar and coworkers"^^ with an efficient algorithm 
by Clark et al."^®. 

The variational energy is calculated via Metropolis sam¬ 
pling of IV’Tp, 

_ {iPtIHI'iPt) _ /|V^t(R)P dK 

/|V'T(R.)PdR 

where R is a compact notation for the coordinates of the elec¬ 
trons and iF^7’(R)/7/>7’(R) is the ’’local energy”. With this 
trial wavefunction we perform FN-DMC to calculate the en¬ 
ergy using the mixed (or projected) estimator. 


Ab initio QMC comprises of a suite of methods that ef¬ 
ficiently sample the phase space of N electrons each mov¬ 
ing in 3-dimensional real space. When the wavefunction as a 
function of 3N coordinates is known, the phase space can be 
sampled with variational Monte Carlo (VMC) using Metropo¬ 
lis algorithms. For ground state calculations, the Diffusion 
Monte Carlo (DMC) method, based on imaginary time evo¬ 
lution of the Schroedinger equation, is formally exact but in 
practice limited by the numerical sign problem. This prob¬ 
lem ceases to exist if one knew the exact location of the nodes 
(zeroes) of the many-body wavefunction. Thus, the optimal 
strategy for very accurate calculations is to generate a good 
trial wavefunction, and optimize its parameters to minimize 
its variational energy. Then, we use this wavefunction to ’’fix 
the nodes” (which may only approximately correspond to the 
exact nodes) and perform a DMC calculation under this con¬ 
straint. This last variant is called the fixed-node DMC (FN- 
DMC) method and is known to be very accurate for a large 
variety of systems. While some more details are discussed 
here, we refer the interested reader to Ref. 39 for an exhaustive 
review of concepts and applications. All the ab initio QMC 
calculations were carried out with the QWalk package"^®. 

A typical QMC calculation was carried out as follows. 
First, we perform DFT calculations with the B3LYP"^' or PBE 
functionals^^ using GAMESS"^^ for molecules or CRYSTAL"^^ 
for solids. The lowest energy DET orbitals provide the Slater 


W'W't) f^,T(R»’(R)dR '' 

where ip = exp(—/3iF)'07’ is obtained by a stochastic projec¬ 
tion of ipT under the constraint that ip and ipT have the same 
sign everywhere. 

We now discuss measurements in the QMC methods. Eor a 
generic operator O, the pure (VMC) and mixed estimators are 
computed as. 


{0)vMC = 


{iPt\0\iPt) 

(V’tIV't) 


_ {ip\d\ipT) 

~ {'4’Ht) 


(4) 


The mixed estimator of an operator is equal to the pure esti¬ 
mator in two cases; (1) when ipr is the exact wavefunction or 
(2) when the operator O commutes with the Hamiltonian. In 
more general situations, higher accuracy can be obtained with 
the extrapolated estimator"^^ for approximate eigenstates, 

(^)extrap ~ 2(0)jnix (O)vMC (5) 


Eor accurate wavefunctions, all these estimators must ap¬ 
proach the same value. 

We will construct effective Hamiltonians using the two- 
body reduced density matrix (2-RDM) elements, given by the 
estimator (the normalization has been omitted). 
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Pzjki = {clc]ciCk) = X! / {'R)dr'^dr'f,dli, 

a^b 


( 6 ) 


where R"jj = (ri, r 2 , ....rTv) refers to the set of elec¬ 

tron coordinates obtained by changing the location of two 
electrons and (j)i (r) are a chosen set of one-particle wavefunc- 
tions (orbitals) indexed by label i(j, k, /). The mixed estima¬ 
tor equivalent of Eq. (6) is obtained like that for the energy. 
More details of this computation have been previously dis¬ 
cussed elsewhere by one of us®. The chosen set of orbitals is 
often localized on the atoms; this property makes it convenient 
to derive Hubbard-like models. We explain their construction 
next. 


B. Localized orbitals 

Localized orbitals often provide an intuitive way of under¬ 
standing an electronic system in terms of electron hops and 
on-site or inter-site repulsions. Thus, many works have been 
devoted to this subject; ranging from the Linearized Muffin- 
Tin Orbital (LMTO) method"^* to the maximally localized 
Wannier function construction"^®. Orbital localization has also 
been widely discussed in the quantum chemistry literature. 



Figure 2. Amplitude isosurface contour for (left) one of the six sym¬ 
metry equivalent tt orbitals of the benzene molecule, obtained by 
localizing three bonding and three anti-bonding orbitals and (right) 
a representative localized orbital for the 4x4 unit cell of graphene. 
The colors indicate the sign of the single particle wavefunction. 

The idea is to first select a set of orbitals in a certain energy 
window. Lor solids these correspond to bands close to the 
Lermi level, and for molecules these are partially occupied or¬ 
bitals which constitute the active space. Then, a unitary trans¬ 
formation is performed to optimize a pre-decided metric for 
localization. In this work, we minimize the spread S, 

n 

where ’pni'f) are the desired localized orbitals related to the 
chosen set of orbitals by a unitary transformation, 

4>n[r) = Uni^^ir)■ 

Lor some systems, as we will see in the case of benzene and 
graphene, schematically shown in Lig. 2, it is necessary to in¬ 
clude unoccupied orbitals to get well-localized orbitals of the 
right symmetry®®. Thus the construction of localized orbitals 


is not a black-box procedure and may need adaptations based 
on the specifics of the system. 

We note that the optimized parameters of the effective 
Hamiltonian may, in general, depend on the choice of local¬ 
ized orbitals. However, we have not explored this dependence 
- our main objective in this paper is to assess the validity of 
model Hamiltonians with respect to ab initio calculations for 
a particular choice of single particle basis. 

C. Lattice model calculations 

The lattice model calculations for Hubbard models of ben¬ 
zene and graphene at half-filling were carried out with a com¬ 
bination of our own codes and the freely available QUEST 
determinantal quantum Monte Carlo package®'. Lor the hon¬ 
eycomb lattice half-filled Hubbard model, the determinantal 
QMC method is sign problem free and the results are exact up 
to statistical errors. A time step of 0.1 was chosen and /3 (the 
imaginary time) was set to 20 for every calculation. Measure¬ 
ments were performed for 5000 sweeps, with an additional 
2000 sweeps being used for equilibration. 


III. CRITERIA FOR A LOW ENERGY EFFECTIVE 
HAMILTONIAN 

Our aim is to obtain a low energy effective Hamiltonian de¬ 
fined in the active space of electrons. In this basis, the criteria 
it must satisfy are: 

(a) The reduced density matrices (RDM) of the ground and 
excited states obtained from the ab initio calculation must 
match with that of the model calculation. 

(b) The energy spectra of the ab initio and model systems 
must match in the energy window of interest. 

(c) The model must be detailed enough to capture the es¬ 
sential physics and yet simple enough to avoid over¬ 
parameterization and over-fitting. 

The concept of matching ROMs [criterion (a)] has previ¬ 
ously appeared in related contexts®’®^’®® and in work by one 
of us®. Most physical properties, such as the charge and 
spin structure factors, are functions of the 2-RDM. Since it 
is computationally expensive to calculate high-order RDMs, 
we use the matching condition only on the 2-RDM, pijki = 
(cjcjc;cfe) where i,j, k, I are orbital indices (including space 
and spin). This criterion automatically ensures that the com¬ 
bined number of electrons occupying the orbitals is equal to 
those in the model Hamiltonian. If any input state does not 
have the expected electron number in the active space, it can 
not be described by the effective Hamiltonian. 
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The importance of excited state energies used in the fitting 
[criterion (b)] is highlighted by the fact that the wavefunc- 
tions, and their corresponding two-body density matrices, are 
invariant to many kinds of terms that enter the Hamiltonian. 
For example, the transformation, 

H' ^ H + aS^ + Pn + ^S'^n ( 8 ) 

is, by construction, consistent with all the 2-RDM data for 
any a, P, 7 for systems which have definite spin symmetry 
and particle number. Imposing certain physical constraints 
on the form of the interactions can reduce the need for this 
criterion. To give a concrete example, consider wavefunc- 
tion data generated from the ground state of an unfrustrated 
Heisenberg spin Hamiltonian on a bipartite lattice H = 
J'Liij) Si ■ Sj. where {i,j) refer to nearest neighbor pairs 
and Si is the spin operator on site i. Then adding aS'^ gives 
the same reduced density matrices for the ground state, as long 
as a is small enough to not cause energy crossings i.e. not 
make an original excited state the new ground state. This ad¬ 
ditional term has the effect of introducing long-range Heisen¬ 
berg couplings. Moreover, the effective Hamiltonian is not 
unique; the Lieb-Mattis model^^ H = Sa ■ Sb (where A(B) 
and Sa(b) refer to the sublattice and corresponding spin), is 
also known to reproduce the low-energy limit of the Heisen¬ 
berg model. Thus, imposing the requirement that the Hamil¬ 
tonian has the nearest-neighbor form constrains a to zero and 
picks one particular model. Similar arguments should apply to 
extended Hubbard models in homogeneous systems where a 
physical constraint is that the density-density interaction must 
decrease monotonically with distance between orbitals. 


IV. AB INITIO DENSITY MATRIX BASED 
DOWNFOLDING (AIDMD) PROCEDURES 

We now discuss two procedures that use density matrices 
and energies to calculate parameters entering the effective 
Hamiltonian; both have been schematically depicted in Fig. 3. 


A. Eigenstate AIDMD method 

In the first method, schematically depicted in Fig. 3(a), 
eigenstates from an ab initio calculation are used to match 
density matrices and energies of the corresponding model. 
The QMC extrapolated estimator is taken to be an accurate 
representation of the true one. Then the parameters of the 
model Hamiltonian are obtained by minimizing a cost func¬ 
tion that is a linear combination of the energy and density ma¬ 
trix errors, 

E ((cic]Qc.):-(cic|.Qc,)r)^ 

(9) 

where the subscript s is an eigenstate index, i,j, k, I are or¬ 
bital indices and the superscripts a and m refer to ab initio and 
model calculations respectively. There is no definite prescrip¬ 
tion for choosing the weight /; a good heuristic is to choose 


a value that gives roughly the same size of errors for the two 
terms that enter the cost function. The cost minimization is 
performed with the Nelder Mead simplex algorithm. 

In practice we found that since the number of available 
eigenstates and the accuracy of true estimators is limited, a 
second method discussed next is more suited for downfold¬ 
ing. 


B. Non-eigenstate AIDMD method 

Consider a set of ab initio energy averages Eg, i.e. expec¬ 
tation values of the Hamiltonian, and corresponding 1- and 
2-RDMs (c|cj)s, {clcjCiCk)s for arbitrary low-energy states 
characterized by index s. Assume a model 2-body Hamilto¬ 
nian with effective parameters tij ( 1 -body part) and Vij^k,i 
(2-body part) along with a constant term C; the total number 
of parameters being Np. Then for each state s, we have the 
equation. 

Eg = {H)g = C + Y,Uj{clcj) + J2yvki{clc]ciCk) (10) 

ij ijkl 


where we have made the assumption that the chosen set of 
operators corresponding to single particle wavefunctions or 
orbitals, explain all energy differences seen in the ab initio 
data. The constant C is from energetic contributions of all 
other orbitals which are not part of the chosen set. 

We then perform calculations for M low-energy states 
which are not necessarily eigenstates. These states are not 
arbitrary in the sense that they have similar descriptions of the 
core and virtual spaces. Each state satisfies the criteria (1) 
its energy average does not lie outside the energy window of 
interest and (2) the trace of its 1-RDM matches the electron 
number expected in the effective Hamiltonian. 

The objective of choosing a sufficiently big set of states is to 
explore parts of the low-energy Hilbert space that show vari¬ 
ations in the RDM elements. Since the same parameters de¬ 
scribe all M states, they must satisfy the linear set of equa¬ 
tions. 


( El \ 

E 2 

Ea 


\ Em J 


( 1 (clcj)i 

1 {c\cj)2 

1 (4cj)3 
1 
1 
1 

\1 (c1cj)m 


(cJ4ciCfc)i 

(c|4QCfc)2 

(c|4ciCfc)3 

(cJc]ciCfc)4 


(4c]c;Cfc) 


M 


^ijkl 

V .. y 


(11) 


which is compactly written as. 


E = Ax 


( 12 ) 


where E = (Ei, E 2 , ...Em)'^ is the M dimensional vector of 
energies, A is the M x Np matrix composed of density matrix 
elements and x = is a Np dimensional 
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C. Generating states for AIDMD methods 
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Figure 3. Schematic of ab initio density matrix downfolding 
(AIDMD) methods employed for determining the effective Hamilto¬ 
nian parameters, (a) In the eigenstate (E)-AIDMD, the reduced den¬ 
sity matrices and energies of eigenstates of the model are matched 
to the ab initio counterparts, (b) The non-eigenstate (N)-AIDMD 
method uses RDMs and energies of arbitrary low-energy states to 
construct a matrix of relevant density matrices and performs a least 
square fit to determine the optimal parameters. 


vector of parameters. This problem is over-determined for 
M > Np, which is the regime we expect to work in. 

In the case of any imperfection in the model, which is the 
most common case, the equality will not hold exactly and one 
must then instead minimize the norm of the error, Af: 

Ar= Px-E||=^ (13) 

This cost function can be minimized in a single step by using 
the method of least squares, employing the singular value de¬ 
composition of matrix A. This matrix also encodes exact (or 
near-exact) linear dependences. Thus, the quality of the fit is 
directly judged by assessing (1) the singular values of the A 
matrix and (2) the value of the cost function itself i.e. the de¬ 
viations of the input and fitted energies. We will refer to this 
as the non eigenstate (N)-AIDMD method throughout the rest 
of the paper. This idea is schematically depicted in Fig. 3(b). 

The matrix A gives a very natural basis for understanding 
renormalization effects. For example, consider a set of wave- 
functions that show that the correlator {niUj) does not change 
significantly. This would lead to the corresponding column of 
matrix A being identical (up to a scale factor) to the first col¬ 
umn of I’s. Physically, this would correspond to the coupling 
constant Vijji being irrelevant for the low-energy physics; it 
can take any value including zero and can be absorbed into the 
constant shift term. This could also alternatively mean that the 
input data is correlated and does not provide enough informa¬ 
tion about Vijji, so care must be taken in constructing the set 
of wavefunctions. 

In summary, the N-AIDMD method performs the following 
operation. The 1- and 2-RDMs and energy expectation values 
of many non-eigenstate correlated states are calculated. Then, 
given an effective Hamiltonian parameterization, linear equa¬ 
tions (12) are constructed and solved. Standard model fitting 
principles apply, and we can evaluate the goodness of fit to de¬ 
termine whether a given effective Hamiltonian can sufficiently 
describe the data. 


We now address the central issue of generating states to be 
used as inputs for the AIDMD methods. 

For the E-AIDMD, the near-eigenstates were obtained by 
performing CISD calculations with multiple roots and opti¬ 
mizing a multi-determinant Jastrow wavefunction with each 
CISD guess as a starting point. This is known to be approx¬ 
imate, especially for higher excited states. It is the inherent 
uncertainty about the accuracy of this procedure, along with 
the fact that only a small number of eigenstates are accessible, 
that limits the utility of E-AIDMD. 

Erom the point of view of the N-AIDMD method too, au¬ 
tomating the construction of the database of wavefunctions 
may not be completely straightforward and here we offer 
some heuristics for doing this within QMC methods. We re¬ 
emphasize that any state described by the effective Hamil¬ 
tonian must be one that does not involve large contributions 
from the core and virtual orbitals i.e. single particle degrees 
of freedom outside the chosen active space. This check can 
be imposed at the ab initio level by monitoring the RDMs, for 
example, the trace of the 1-RDM taken over the active orbitals 
must equal the number of electrons in the effective Hamilto¬ 
nian description. 

One way to generate new states is to perturb near¬ 
eigenstates. Eor example, after optimizing the multi- 
determinantal-Jastrow trial wavefunction, we artificially 
change the determinantal coefficients by small amounts. This 
procedure changes the nodal surface and gives energies close 
to, but different from, the optimized ground state. A sec¬ 
ond source of data is spin excitations of the DET reference 
Slater determinant, generated within the space of orbitals that 
play an important role in the active space; for benzene and 
graphene these involve the Kohn-Sham orbitals with tt sym¬ 
metry. Einally, in the case of extended systems, we chose a 
linear combination of determinants which, in spite of being 
not size-extensive, reveal additional properties of the effective 
Hamiltonian. 


D. Quantum Monte Carlo specific adaptation 

The formalism introduced above applies to any method that 
calculates energies and density matrices. In this paper, all the 
expectation values entering the A matrix are calculated for the 
chosen low-energy wavefunctions by Monte Carlo sampling. 

Once this database of states has been generated, we per¬ 
form two independent calculations to estimate the parameters, 
one using variational and the other using mixed estimators. In 
the latter case, we modify the linear equations (12) by using 
Es = and the projected estimates of the density 

matrix elements i.e. ('0r*|c|cjj'0®) and (V't*| c-cjc;Cfe|'0®) in 
the construction of the A matrix. The implicit normalization 
of these mixed estimates by is assumed. This pro¬ 

jector formulation is also amenable to coupled-cluster calcu¬ 
lations which work with projected energies and density matri¬ 
ces. 
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The bias arising out of fixing the nodes of the projected 
wavefunction does not affect our formulation. This is because 
the method regards as some arbitrary positive sampling 
function associated with a low energy state, and it is this same 
distribution that is used for the evaluation of the density matrix 
elements. Each such distribution provides a linear equation 
encoding the relationship between the FN-DMC energy and 
the projected density matrix elements and the unknown pa¬ 
rameters. Up to errors from statistical uncertainties and from 
the assumption of the form of the Hamiltonian, this relation¬ 
ship is exact. 

However, the value of the optimal parameters does de¬ 
pend on the choice of method. For example, for the benzene 
molecule presented in section VI, the VMC and FN-DMC pa¬ 
rameter values agree with each other up to 10%; the largest 
discrepancy is due to different constant terms. This discrep¬ 
ancy is expected, because VMC does not provide an accurate 
description of the core and virtual spaces. 

Ideally, only the FN-DMC calculations should be used to 
estimate the parameters. However, the mixed estimator in FN- 
DMC is biased because of the inaccuracy of the trial wave- 
function. Thus, we propose a better estimator for the parame¬ 
ters. 


p = 2pD-pv (14) 

where p is the true parameter, and pn and pv are the corre¬ 
sponding parameters obtained from FN-DMC and VMC cal¬ 
culations. The details of this result are explained in Appendix 
A. 


V. SIMPLE APPLICATION: HUBBARD TO 
HEISENBERG MODEL 

To demonstrate our formalism for a simple example, we 
consider the two site Hubbard model and fit information from 
the lowest two states to a Heisenberg model. 

We analytically solve for all four eigenstates of the Hamil¬ 
tonian, 


H = cl^Cj^a + h-c. + U'^ (15) 

ij i 

for two opposite spin electrons on two sites, where t is the 
hopping, U is the Hubbard on-site interaction. The Hilbert 
space on a single site (orbital) is spanned by four states |0) 
(unoccupied), | f) (single up occupied), | j,) (single down 
occupied), | t4-) (doubly occupied). For completeness, we 
discuss some features of the solution method below. 

First notice that the triplet state with 

energy Et = 0 and the state |V’d) = energy 

Ed = U, are exact eigenstates of the problem independent of 
the values of t and U. 

To get the other two states, write the Hamiltonian in 
the basis of \ijjs) = (I t i) + U t)) and ) = 


^ (I n 0) + |0 n)), 

" = ( -2t ) (X-) 

Then diagonalizing it, we get the energies to be. 

The lowest energy corresponds to the singlet, E- and the 
corresponding eigenvector is 


IV'-) = 


2t 


li’s) - 


E_ 




--\^d ) (18) 




with the next excited state being the triplet \4’t)- 
We choose the Heisenberg form to fit to 

H = C + JSi- S 2 (19) 


To determine the parameters C and J, form the 2x2 A matrix 
with the lowest two energy states. 


E_ 

Et 


1 

1 


4-r(E_/t)2 

-3/4 



( 20 ) 


Using derived values of Es and Et = 0, we get, 

E,[A+{E_/tf) 

A+l{E./tY 


( 21 ) 


which to lowest order in f/(7 is J = —At^/U. 

Observe that the correlator for {Si ■ Sj) is not exactly 1/4 
but only approximately so. This is expected since the fluc¬ 
tuations from the high-energy states are not exactly zero, if 
it were, it would be equivalent to exactly block-diagonalizing 
the Hamiltonian. This exact block diagonalization is not pos¬ 
sible in general, unless it is also accompanied with a change 
in the low energy degrees of freedom entering the model. 

If we now rotate the two low energy eigenstates and define 
the orthogonal linear combinations, 

lij'-t) = p\ip-) + q\ipt) (22a) 

l'02) ='/IV'-)-pIV't) (22b) 


and calculate their energies and construct the corresponding A 
matrix, we get J to be independent of p and q. This property 
is desirable as the method does not hinge on the requirement 
of eigenstates as inputs. 

In terms of canonical transformations (here equivalent to 
second order perturbation theory), the matrix element of the 
effective Hamiltonian between the singly occupied states is. 


(u |H| w = E w 

n ^ 

_ - 2/2 

“ ~ir 


(23) 


Since this matrix element must equal J/2 in the Heisenberg 
model we arrive at the same result, namely J = —At^/U^^. 
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VI. APPLICATION TO A MOLECULE: BENZENE 

We show the workings of the described AIDMD methods 
for the benzene molecule. Our choice of system is motivated 
by its simplicity (one band model and presence of many sym¬ 
metries) and the availability of experimental energies to com¬ 
pare to. 

DFT calculations were first performed in the TZVP ba¬ 
sis with Burkatzki-Filippi-Dolg pseudopotentials^^, using the 
molecular geometry that corresponds to a B3LYP optimized 
calculation. These serve as a starting point for the QMC calcu¬ 
lations, discussed in the Methods section. In the charge neu¬ 
tral sector, there are a total of 30 electrons (for example, 15 
t and 15 I for the spin-singlet state) and our objective is to 
downfold this system to an effective one with 6 electrons (3t, 
3i). 

The model Hamiltonian is defined in the space of six tt or¬ 
bitals; a representative localized orbital has been shown in 
Fig. 2. These orbitals were obtained by localizing the high¬ 
est three occupied and the lowest three unoccupied B3LYP 
orbitals (from the S = 0 DFT calculation) with tt orbital sym¬ 
metry, a well established procedure in the literature^®. The 
overall phase of these orbitals is adjusted to enable use of pa¬ 
rameter symmetries directly when fitting. 

Appendix B discusses the QMC data used for fitting and the 
initial pre-processing to determine the eligibility of states that 
can be described by a six-7r orbital Hamiltonian. 


A. On-site Hubbard model 

We consider the Hubbard model for six orbitals of benzene, 
given by Eq. (15), where t is used for the nearest-neighbor 
hopping and U* is the effective on-site Coulomb repulsion. 
We will discuss multiple ways of using reduced density matrix 
elements to estimate these parameters. 


1. Hubbard U*/t from the E- AIDMD method 

An estimate of U*/t is obtained by directly matching the 
half-filled ground state (S = 0) 2-RDM element correspond¬ 
ing to the ’’double occupancy” correlator {{n^nf}) of the ab 
initio and lattice-model calculations. This element equals 0.25 
for the non interacting case (U* = 0) and its value reduces for 
U* > 0. 

Fig. 4 shows the dependence of (n-j-nj,), computed in the 
ground state of the Hubbard model of a six-site ring at half 
filling, on U*/t. The plot also indicates the value of this cor¬ 
relator computed from various wavefunctions and estimators 
from ab initio QMC calculations of the benzene molecule. 
The trends are consistent with our expectations; the Slater- 
Jastrow (SJ) wavefunction at the VMC level underestimates 
the strength of the effective interactions, which is partly reme¬ 
died by the extrapolated estimator from FN-DMC. However, 
the bias (systematic error) is expected to be large because of 
the considerable difference in the two estimates. This bias 



Figure 4. Double occupancy correlator of a single tt orbital of a six- 
site ring Hubbard model, as a function of U* jt, computed in the 
half-filled singlet ground state. Comparisons are made with the val¬ 
ues from the variational (VMC) and extrapolated (Ext) estimators 
obtained from ab initio QMC calculations for the benzene molecule 
with optimized Slater-Jastrow (SJ) and configuration interaction sin¬ 
gles doubles-Jastrow (CISDJ) wavefunctions. 

is reduced by the multi-determinantal-Jastrow (CISDJ) wave- 
function we employed; the difference between the variational 
and extrapolated estimator is about 5%. 

The value ofU*/t is found to be extremely sensitive to the 
precise value of the double occupancy correlator, a change of 
a few percent (i.e. from 0.24 to 0.20) changes our estimate 
from 0.3 to 1.3 (i.e. a factor of almost 4). In general, 
this observation suggests that it is crucial to look at various 
other elements of the 2-RDM and to look at alternate ways of 
estimating Hubbard parameters. 

In Fig. 5 we compare results of other correlators from ex¬ 
trapolated QMC estimates with those from the on-site Hub¬ 
bard model on a six site ring for varying values of U* /t. We 
focus on the one-body density matrix (cq ^Ci^a) (the values 
are the same for both spin indices cr), density-density correla¬ 
tors {noUi) and spin-spin correlators {S§Sf), all as a function 
of distance (i) with respect to a reference site (labelled 0). 
The value of U* /t Ri 1.4 gives the smallest errors for most 
observables, except for the nearest-neighbor density-density 
correlator which favors a value of U* /t ~ 0. In the limit that 
the model is perfect, all estimates must yield the same value; 
the differences reflect an inadequacy of the on-site Hubbard 
model in describing all the data. This is evidence for the need 
for long-range interactions. 


2. Hubbard U*/t from the N- AIDMD method 

As mentioned previously, the idea of matching density ma¬ 
trix elements is useful only for comparing exact eigenstates. 
However, it is difficult to construct eigenstates with very high 
accuracy in the ab initio calculations and at times also for the 
equivalent model for large system sizes. This is why we ap¬ 
peal to the N-AIDMD method, introduced and explained in 
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Figure 5. Comparison of correlation functions from ab initio quantum Monte Carlo calculations of the benzene molecule and the six-site 
Hubbard ring for varying values of U*/t. The half-filled ground state has been considered which corresponds to 15 t and 15 4- electrons in 
the ab initio calculation and 3 t and 3 4, electrons in the model calculation. The top panels show the (a) one-body density matrix (cj 
(b) density-density correlators (nom) (c) spin-spin correlators all as a function of distance (i) with respect to a reference site (0). 

Panels (d)-(f) show the same data, but from the point of view of errors of the corresponding model correlation functions with respect to the ab 
initio data. The one-body density matrix shows relatively small errors for all U*/t, but the dependence on U*/t is more pronounced for the 
density-density and spin-spin correlators. We infer that the value U* jt « 1.4 reproduces most correlators well, except the nearest-neighbor 
density-density correlator. 


Sec. IV B, which is relatively insensitive to the nature of the 
states input to the method. For charge-neutral benzene, we 
construct the A matrix by taking various states in a 10 eV en¬ 
ergy window above the ground state using VMC and DMC 
methods. 

Fig. 6 shows the comparison of the fitted energy and the in¬ 
put VMC or DMC energy. The former is obtained by taking 
the linear combination of the ah initio VMC or DMC density 
matrices weighted by the optimized parameters of the effec¬ 
tive Hamiltonian. A perfect agreement between the fitted and 
ab initio input data would correspond to all energies falling 
exactly on the y = x line. By this measure, the Hubbard 
model for benzene is reasonable, though not accurate, as is 
seen in Fig. 6(a) and (d). The presence of significant devia¬ 
tions of the order of 1 — 2 eV from the y = x line indicates 
the need for a more refined model, which we discuss in sec¬ 
tion VIB. 

The VMC data yields optimal parameters of f = 3.0 eV 
and U* = 5.2 eV (U*/t = 1.7) and the FN-DMC data gives 
t = 2.8 eV and U* = 3.9 eV (U*/t = 1.4). The extrapolated 
estimate of the optimal parameters, t = 2.6 eV agrees with 
the value of t = 2.54 eV reported by Bursill et al.^®. The 
extrapolated value of U*/t k, 1.0 is also broadly consistent 
with a recently reported estimate^® to within 10-20%. 


B. Extended Hubbard model 


Having established the need for long-range interactions 
in benzene, we consider the extended-Hubbard or Parisier- 
Pople-Parr (PPP) model. 


H — ^ ^ ^ h.C. -f U ^ ^ 4” ^ ^ 

ij i ij 

(24) 

where U is the on-site Hubbard interaction and tij and Vij 
are inter-orbital hopping and density-density interactions. We 
compare our results with Bursill et al^^, who considered only 
a nearest neighbor hopping and took Vij to be of the Ohno 
form®° 


V.j = 


u 


y/l + {arij)^ 


(25) 


where a is a fit parameter and Vij is the spatial separation be¬ 
tween nuclei. This parameterization has been widely used in 
the modelling of various organic polymers. Here do not make 
any assumptions about the form of the interactions and instead 
use the N-AIDMD method to determine these parameter val¬ 
ues. 

We repeat analyses similar to those for the Hubbard model, 
in addition to carefully looking at the variations in the 1- and 
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Figure 6. Comparison of ab initio (a;-axis) and fitted energies (t/-axis) of the benzene molecule using the N-AIDMD procedure for different 
model Hamiltonians. The non-eigenstate data is generated by considering various spin excitations and nodal surfaces, all in the same charge 
sector (30 electrons). The ab initio energy is directly sampled using quantum Monte Carlo methods - for the variational Monte Carlo method 
(VMC) it corresponds to {iI)t\H\%I}t) /(t/^TIt/'T) and for the fixed-node diffusion Monte Carlo (DMC) to /{%I}t\'4>) where ipT and 

t/) correspond to trial and projected wavefunctions respectively. The fitted energy is obtained from the optimized parameters by multiplying 
them by the corresponding ab initio density matrix elements. The top panels (a)-(c) show the VMC results and the bottom ones (d)-(f) show 
the fixed-node DMC ones, (a) and (d) correspond to the on-site Hubbard model on a six-site ring, (b) and (e) the extended Hubbard model, 
and (c) and (f) include an additional third-nearest neighbor hopping. 


Parameter 

PPP [Ref. 58] 

PPP-VMC 

PPP-DMC 

PPP-Extrap 

U* [Ref. 59] 

[/*-VMC 

[/*-DMC 

t 

2.54 

2.87(1) 

2.76(1) 

2.65(2) 

2.54 

3.04(4) 

2.80(1) 

U 

10.06 

11.95(4) 

10.92(4) 

9.89(6) 

3.04 

5.2(2) 

3.9(1) 

Voi 

7.18 

7.47(5) 

7.13(3) 

6.78(8) 

- 

- 


V 02 

5.11 

5.40(3) 

5.41(2) 

5.40(4) 

- 

- 


Vo3 

4.57 

4.57 

4.57 

4.57 

- 

- 


Ujt 

3.96 

4.16(2) 

3.96(2) 

3.73(3) 

1.20 

1.7(1) 

1.39(5) 


Table I. Model Hamiltonian parameters (in eV) from different downfolding methods, using data from states in the charge neutral sector of 
benzene. Vba sets the constant shift or chemical potential for the interaction terms; its value has been set to match previous semi-empirical fits. 


2-RDM matrix elements. This data has been discussed as part 
of Appendix B and has been shown in Fig. 11. For a parame¬ 
ter to be reliably estimated there should be a large variation in 
the corresponding density matrix element for different wave- 
functions in the low energy space. By this metric, we find 
that the next nearest neighbor hopping fo 2 is irrelevant in the 
charge-neutral sector. We thus attempt to fit to a model only 
with the nearest neighbor t = foi along with U, Vqi and V 02 ', 
Vo 3 is not needed as it simply sets the chemical potential. 


The inadequacies of the on-site Hubbard model, shown in 
Figs. 6(a) and (d), are rectified by the extended one, shown 
in Figs. 6(b) and (e); the maximum energy errors of about 2 
eV are reduced to Ri 0.3 eV. The root mean square errors are 
much smaller as well, reducing from 0.5 eV to about 0.06 eV. 
Adding the next-next nearest neighbor hopping fgs, shown in 
Figs. 6(c) and (f), only marginally improves the accuracy of 
the fit; fo 3 is found to be only about a tenth of the value of toi 
suggesting that its effects can be largely accounted for by for- 


9 



























Figure 7. Panel (a) shows the comparison of experimental energy gaps of the benzene molecule and energy gaps of the eigenstates of the 
extended Hubhard or Parisier-Pople-Parr (PPP) model on a six-site ring using optimized VMC, fixed-node DMC and extrapolated parameters, 
obtained from the N-AIDMD method. The extrapolated parameters remove, to a large extent, the bias from the other two calculations. All 
experimental values and associated error-bars are taken from Bursill et al. Ref.[58], who used these values to fit to a PPP model with density- 
density interactions of the Ohno form in Eq. (25). Panel (b) shows the comparison of experimental energy gaps of the benzene molecule and 
energy gaps of eigenstates for different model Flamiltonians. The on-site Hubbard model and the previous Ohno-parameterized Hamiltonian 
have at least one significant outlier, which is largely remedied by allowing all Vij and U in the PPP model to be varied. 


To assess the accuracy of these parameters, we compare the 
results of the model with experimentally available energies. 
First, as Fig. 7(a) shows for the extended-Hubbard or PPP 
model, the extrapolated parameters give an improved agree¬ 
ment with experiment compared to the VMC or FN-DMC pa¬ 
rameters. Most energy gaps of this model are in excellent 
agreement with available experimental energies, the errors are 
0.2 eV or less. The largest outlier at about 7.5 eV is within 2a 
of the experimental result. 

Next, we compare the experimental energies with energy 
gaps from various model Hamiltonians. While the Hubbard 
and the Ohno parameterizations reproduce most experimental 
gaps, especially at low energies, they have at least one signif¬ 
icant outlier. These outliers are correctly accounted for by the 
fitted PPP Hamiltonian, and improved upon by the introduc¬ 
tion of fo 3 - Owing to the small value of fo 3 , more data in the 
N-AIDMD method may be needed to precisely estimate its 
value. The extreme sensitivity of the high energy eigenstates 
to fo 3 may explain the deviation of the model gap from the 
experimental one at about 7.8 eV. 

We now discuss our parameter values and the errors associ¬ 
ated with them; these have been summarized in Table I. While 
our PPP parameters are generally consistent with the Ohno 
form, there are some differences of the order of 0.3 — 0.5 eV, 
that improve the quality of the fitted energies. We emphasize 
that we have not provided any experimental inputs; rather we 
have used only the QMC data (energies and density matrices) 
from multiple states to obtain the Hamiltonian parameters. 

In order to check the robustness of the fit, we estimated 
errors in our parameters from a Jackknife analysis. In this 
scheme half the input states to the N-AIDMD method were 


randomly discarded and the fit performed with the retained 
half. Many such randomly generated ensembles of input data 
were taken and the resultant parameters were averaged over 
all of these. The difference of the parameters of this reduced 
data set and those obtained from the full data set provides an 
estimate of the systematic error, which we report in table I. 
The other source of error is from statistical noise, which in the 
present case was found to be much smaller than the systematic 
error. 


VII. APPLICATION TO A SOLID: GRAPHENE 

As an application of the AIDMD methods to solid mate¬ 
rials, we consider graphene, a 2D solid of carbon atoms ar¬ 
ranged on a honeycomb lattice. Graphene has great poten¬ 
tial technological applications, which has spurred much work 
devoted to understanding it thoroughly®'. In addition, there 
have been several proposals for engineering exotic phases 
in graphene®^^®"'. That said, it is only recently that system¬ 
atic studies to estimate the role of electron-electron interac- 
tions^®’®®’®® have been carried out. While some of its long¬ 
distance properties appear to be adequately described by a 
tight-binding model, the short range features, crucial for phe¬ 
nomena such as magnetism, require more refined modeling. 

Early studies modelled graphene as a honeycomb lattice 
Hubbard model with aU/t estimated to be Ri 3.8. This would 
put graphene on the verge of a metal-insulator transition®®’®^. 
However, recent results by Wehling et al. realized the im¬ 
portance of long range interactions^®, which renormalize the 
on-site interaction to an effectively lower value. Schuler et 
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Figure 8. Double occupancy correlator of a single vr orbital of the 
L X L (L = 2, 4, 6, 8) honeycomb lattice Hubbard model with 
periodic boundary conditions, as a function of U*/t, computed in 
the half-filled singlet ground state. Comparisons are made with 
the values from variational (VMC), mixed (DMC) and extrapolated 
(Ext) estimators obtained from ab initio QMC calculations on 4 x 4 
graphene using an optimized Slater-Jastrow (SJ) trial wavefunction. 
The cRPA estimate from Ref. [59] is also shown. 

al.^® report the effective C/*/f to be 1.6 ± 0.2, which means 
graphene lies well in the semimetal phase of the honeycomb 
lattice Hubbard model. 

Setting aside the question of determining all the long range 
interactions in graphene, we ask what U*/t best describes our 
ground state QMC data. To do so, we first generated the rr-like 
Wannier functions within QWalk^**, a representative of which 
has been shown in Fig. 2. 

Just as in the case of benzene, we used the fact that the ef¬ 
fective strength of the Coulomb interaction U*/t is most sen¬ 
sitive to the 2-RDM element For the 4x4 unit cell 

with periodic boundary conditions, and using optimized Slater 
Jastrow wavefunctions, the extrapolated value is found to be 
0.221(5) corresponding to a U*/t ~ 1-1(1). This estimate 
of U*/t is obtained from comparisons to lattice determinantal 
QMC calculations, which were carried out for sizes ranging 
from 2 X 2 to 8 X 8 to check for finite-size effects. As Fig. 8 
shows, the 2x2 unit cell is distinctly different from the larger 
unit cells and the finite-size errors in the double occupation 
correlator are negligible beyond sizes L > 4. The finite size 
effects for other short range correlation functions (not shown) 
are also negligible beyond L> A. 

We also calculated many non-eigenstates for 3x3 graphene 
in an energy window 3 eV above the ground state. Fig. 9 
shows the VMC and FN-DMC energy fits to tight-binding and 
Hubbard models. The hopping t for the tight-binding model 
is found to be in the range of 2.2 to 2.8 eV, as is indicated in 
Figs. 9(a) and (c). However, a precise estimate of this param¬ 
eter is not particularly meaningful because the model is in¬ 
adequate at capturing many states, particularly spinful excita¬ 
tions, in this energy window. We note that a value off = 2.80 
eV^® has been previously calculated with DFT methods. 

Figs. 9(b) and (d) show that the Hubbard model reduces 


the errors of the tight-binding model, and the value of U*/t 
is found to be in the range of 1.9 (VMC) to 1.3 (FN-DMC). 
The latter estimate is expected to be more accurate and hence 
closer to the true value of [/*/f in a small energy window 
associated with the ground state. We also note that this value 
is within 2a of the value derived from the constrained RPA 
parameters'^. However the Hubbard model too has outliers of 
about 0.4 eV, which are large for an accurate model of a solid 
material. 

This inadequacy is confirmed by assessing various corre¬ 
lators in the half filled ground state. Fig. 10 shows the up 
density-up density and up density- down density correlators 
as a function of distance between carbon atoms. On the scale 
of Fig. 10(a) (and well within the accuracy of our calculations) 
the like spin correlations were captured well for all values of 
U*/t in the range from 0 to 2. However, as Fig. 10(b) shows, 
the Hubbard model for large U*/t tends to exaggerate the the 
effective interaction between the two electron spin flavors at 
small separations. In particular, the nearest neighbor unlike 
spin density-density correlator is found to be in better agree¬ 
ment with U*/t ~ 0 than any finite value. This, just like the 
case of benzene, suggests the need for longer range interac¬ 
tions in the model. There are also small deviations between 
the ab initio QMC and the Hubbard model results at longer 
distances. These correlations do not depend significantly on 
U*/t and these data in isolation do not rule out any model. 


VIII. CONCLUSION 

We have demonstrated ab initio density matrix downfold¬ 
ing (AIDMD) methods where ab initio quantum Monte Carlo 
(QMC) data is used to fit simple effective Hamiltonians. We 
have elaborated on the fitting procedures and the intricacies of 
the QMC method needed to perform calculations. The limi¬ 
tations of the model were judged by assessing the quality of 
the fitted energies and 2-body density matrices. This feature is 
useful for constructing refined models needed for the accurate 
simulation of real materials. 

For the benzene molecule, while the on-site Hubbard model 
with U*/t K, 1.2 ± 0.2 was able to capture most features of 
the QMC ground state data, the deviations of the density ma¬ 
trices revealed the need for longer range interactions. Includ¬ 
ing these interactions improved the agreement of the model 
with both the QMC results and the experimental data. This 
effective Hamiltonian parameterization could be used to cal¬ 
culate low-frequency response functions and to check semi- 
empirical methods. 

Since QMC calculations use size-consistent wavefunctions 
for extended systems and scale favorably, we believe the type 
of calculations presented here will be a promising alterna¬ 
tive to DFT-based downfolding approaches for solid materi¬ 
als. Our demonstration for the single band model of graphene 
yielded an effective U*/t = 1.3 ± 0.2, in the same range 
as a recently reported estimate based on the constrained-RPA 
method^®. We leave a more detailed characterization of inter¬ 
actions in graphene to future work. 

In more complicated materials, where the form of the 
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Figure 9. Comparison of ab initio (a;-axis) and fitted energies (y-axis) of the 3x3 periodic unit cell of graphene, using the N-AIDMD procedure. 
The non-eigenstate data is generated hy considering various spin excitations and nodal surfaces, all in the same charge sector (72 electrons). 
The ab initio energy is directly sampled using quantum Monte Carlo methods - for the variational Monte Carlo method (VMC) it corresponds 
to {iPtIHIiIit)/{ ipTltltT) and for the fixed-node diffusion Monte Carlo (DMC) to (i/jtITT|' i/))/(^T|t/>) where i/jt and tp correspond to trial and 
projected wavefunctions respectively. The fitted energy is obtained from the optimized parameters by multiplying them by the corresponding 
ab initio density matrix elements. The top panels (a),(b) show the VMC results and the bottom ones (c),(d) show the fixed-node DMC ones, 
(a) and (c) correspond to the tight binding model on a honeycomb lattice, and (b) and (d) correspond to the on-site Hubbard model. 


Hamiltonian is unclear, we suggest that the dominant terms 
can be obtained from canonical transformation theory fol¬ 
lowed with an accurate fit to the QMC data. In this spirit, 
it will also be useful to compare the predictions of the pro¬ 
posed AIDMD schemes with other complementary proposals 
for downfolding'"^’®*'®. 

Finally, we remark that previously unsolved model Hamil¬ 
tonians are now being accurately treated with tensor network 


methods^**’^*. With parallel advances in the ab initio QMC 
simulation of high-temperature superconductorsa clear 
future direction is to deduce more refined models for these 
compounds, using the ideas discussed in the paper. 
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Figure 10. Comparison of correlation functions of the half-filled, unpolarized ground state of 4 x 4 periodic cell of graphene from ab initio 
quantum Monte Carlo with those from determinantal quantum Monte Carlo calculations of the honeycomb lattice on-site Hubbard model for 
various values of U* jt. Panel (a) shows the up density - up density correlator (no.t^ht)) shows the up density - down density correlator 

(no.fni,^) for the neighbor of a reference site (labelled 0). The like-density correlators depend weakly on U*/t on the scale shown for 
all i. The correlator (no.^noa) suggests U*/t « 1.2, but the deviation for (no,tni,|) indicates that the Hubbard model overestimates the 
nearest-neighbor attraction between electrons of opposite spins. 
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X. APPENDIX A: ESTIMATION OF TRUE 
EFFECTIVE HAMILTONIAN PARAMETERS FROM 
VMC AND FN-DMC CALCULATIONS 

In section IV D, we discussed the discrepancy between pa¬ 
rameters obtained from VMC and EN-DMC. This is attributed 
to the inability of the trial wavefunctions used in VMC to pro¬ 
vide an accurate description of the core and virtual spaces. 

As mentioned in the text, ideally, only the EN-DMC calcu¬ 
lations should be used to estimate the parameters. However, 
the mixed estimator in EN-DMC is biased because of the in¬ 
accuracy of the trial wavefunction. To minimize this bias, we 
propose estimators which combine EN-DMC and VMC pa¬ 
rameters. 

Consider a trial VMC wavefunction ipT which deviates 
from the DMC wavefunction -0 by a small amount 6iph or¬ 
thogonal to ipT i-e., 


IV') = IV't) + S\ilJh) (26) 


To obtain the parameter p coupled to an operator pp in the 
effective Hamiltonian i.e. 

H = J2pPp ( 27 ) 

p 


we take partial derivatives with respect to the change of a den¬ 
sity matrix element i.e. 


d{i>\PpW 


(28) 


Since the pure estimators for projected (DMC) wavefunctions 
are not easily evaluated in QMC, we use other estimators us¬ 
ing which we indirectly obtain p. 

The parameter obtained from the mixed estimators within 
EN-DMC, is formally defined as, 


dj'fplHltpT) 

d{it\Pp\’(pT) 


(29) 


On substituting the relation between t/; and ipT wavefunctions, 
we get. 


^ f d{ilj\H\tj}) \ / _ d{tj}\H\'tljh) \ / _ d(t/jjppjtM] ^ 

\d{'^\Pp\'^)) V d{'il^\H\tP) ) \ ) 

(30) 


which to linear order in 5 is, 


PD 


p{l + 6 


V d{ii)\pp\tlj) 


d{'ip\H\tl]) j j 


( 31 ) 


A similar expression for parameters obtained from VMC 
calculations, 


9('0T|Pp|tAT) 


(32) 
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is derived and leads to the result. 


Pv 


pil + 26 


( d{'<p\pp\il^h) 




(33) 


where we have used the hermiticity of the Hamiltonian H and 
the operator pp. 

On combining equations (31) and (33), to leading order in 
S we get. 


p = 2pD - Pv + 0{S'^) (34) 

As expected, all estimates are consistent with each other in the 
limit of exact wavefunctions. 


XI. APPENDIX B: QMC DATA FOR THE BENZENE 
MOLECULE 


trow form, we changed the determinantal coefficients to gen¬ 
erate new wavefunctions. We checked the 1-RDM to make 
sure that it had the correct total electron number (6 elec¬ 
trons) on the localized orbitals that constitute our active space. 
Moreover, if the change in determinantal coefficients led to an 
energy-average outside our pre-decided energy window, the 
new state was discarded from the N-AIDMD procedure. 

For the N-AIDMD method, we desire large variations in 
the density matrix elements for different wavefunctions in the 
low-energy space, in order to accurately estimate the Hamilto¬ 
nian parameters. Fig. 11 shows these variations for all relevant 
density-matrix elements (within FN-DMC) that were needed 
for estimating the parameters of the extended Hubbard model. 
to 2 was found to be irrelevant, as the summed density matrix 
elements coupling to it were found to not vary significantly 
(not shown in the plot). 


Table II shows the QMC data for several eigenstates of 
benzene. Most of these states along with many other non¬ 
eigenstates constitute our data set for fitting a model. 

The calculations confirm the general expectation that sig¬ 
nificant energy gains are obtained by improving wavefunc¬ 
tions going from the single-Slater-Jastrow form to the multi- 
determinantal-Jastrow form (the determinants being selected 
from a CISD calculation). Moreover, the DMC calculations 
improve total energies significantly; typically, the DMC val¬ 
ues are 2 — 3 eV lower than the corresponding VMC value. 

The total electron count from the one-body density matrix 
is assessed to verify the validity of fitting to a six-7r orbital 
Hamiltonian. Table II shows that for charge-neutral benzene, 
the singlet state (S = 0) has up and down electron counts of 
2.96 each, which are close to the expected values of 3,3. For 
the 5 = 2 state, roughly 9 eV above the ground state, the de¬ 
viations were slightly larger; the summed occupation numbers 
were 4.92 and 1.0 in comparison to the expected values of 5 
and 1. Since there is a slight deviation of these numbers from 
integers, we rescale the one and two body density matrices by 
factors (all slightly greater than 1) that correctly accounts for 
sum rules for each individual state used in the AIDMD meth¬ 
ods. 

However, for the 5 = 3 state, the electron occupation num¬ 
bers deviate significantly from the corresponding value in the 
model; almost by one integer. This indicates that the 5 = 3 
state is inadequately described by the proposed Hamiltonian 
and hence can not be used in the fitting procedure. This de¬ 
viation is not completely unexpected since this state is ^ 17 
eV above the ground state and a potential high-energy state. 
Said differently, the active space at this energy scale is con¬ 
siderably different from that assumed for the ground state and 
its low-energy excitations. Thus, this QMC data suggests that 
it is only reasonable for the effective Hamiltonian concept to 
hold only in an energy window of the order of 10 eV above 
the ground state. 

We now assess some aspects of our non-eigenstate data. As 
mentioned in the main text, heuristics were used to construct 
these states. For example, for every near-eigenstate in a sym¬ 
metry sector that was represented by multi-determinantal Jas- 
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DFT 

SJ-VMC 

SJ-DMC 

CISDJ-VMC 

CISDJ-DMC 

N^,N^ 

Used in Fit? 

0 
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2.96,2.96 

Yes 

1 
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-37.5707(7) 

3.94,1.98 

Yes 
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3.94,1.98 

Yes 





-37.4531(6) 
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Yes 

2 

-37.3203 
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Yes 
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-37.0378 
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-37.1074(7) 

-37.0118(4) 

-37.1083(7) 

4.88,0.02 

No 


Table II. Energy of different spin eigenstates in the charge-neutral sector of benzene from DFT and QMC methods. SJ refers to the Slater 
lastrow wavefunctions. CISDJ refers to multi-determinantal Jastrow wavefunctions where the determinants are obtained from a CTsingles 
doubles calculations and their weights optimized within VMC. All DMC calculations used a time step of 0.01 Ha“^. The states that had 
significantly different occupations from the expected values were not used in the fitting as they can not be described by an effective Hamiltonian 
involving only six tt orbitals. 



Figure 11. Variation of density matrix elements for various states of the benzene molecule used in the N-AIDMD method. The summation 
sign indicates a sum over all RDM elements that couple to the same parameter in the lattice model. The panels show (a) the sum over all the 
nearest neighbor one body density matrix elements summed over both spin types and (b) the sum over on-site double occupancies, (c) and (d) 
show sums over nearest and next-nearest neighbor density-density correlators respectively. As is discussed in the text, variations in the density 
matrix elements are needed for the corresponding parameters to be estimated, else they can be taken to be effectively zero. 
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